Imports

In [22]:
from pystock.portfolio import Portfolio, Stock
from pystock.models import Model
from pystock.FFF import FamaFrenchFactors
In [23]:
import warnings
import plotly.io as pio
pio.renderers.default = "notebook"

warnings.filterwarnings("ignore")

The FamaFrenchFactors class

This class is used to download and load Fama French factors. Start by creating an instance of the class.

In [2]:
fff = FamaFrenchFactors()

Working with the Fama-French Factors

Downloading FFF

To download the factors, use the download function. It takes the following parameters:

frequency : str, optional
    The frequency of the data. The default is "D".
factors : int, optional
    The number of factors. The default is 3. Possible values are 3 and 5
directory : str, optional
    The directory to save the file. The default is ".".
overwrite : bool, optional
    Whether to overwrite the file if it already exists. The default is False.

factors has two possible values, 3 and 5.

In [6]:
file_path = fff.download(frequency="M", factors=5, directory=".", overwrite=True)
Downloading Fama French Factors. This may take about 10 seconds.
Download complete. File saved as fff_monthly_5_factors.csv
Use load() to load the file as a pandas dataframe.

load function

Once downloaded, the factors can be loaded using the load function. It takes the following parameters:

directory : str, optional
    The directory to load the file from. The default is ".".
frequency : str, optional
    The frequency of the data. The default is "M".
factors : int, optional
    The number of factors. The default is 3. Possible values are 3 and 5
preprocess : bool, optional
    Whether to preprocess the data. The default is True.
In [3]:
fff5 = fff.load(frequency="M", factors=5, directory=".", preprocess=True)
/media/hari31416/Hari_SSD/Users/harik/Desktop/Finance/pystock_project/pystock/FFF.py:247: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df[col] = pd.to_numeric(df[col]) * 0.01
In [4]:
fff5
Out[4]:
Mkt-RF SMB HML RMW CMA RF
1963-07-31 -0.0039 -0.0041 -0.0097 0.0068 -0.0118 0.0027
1963-08-31 0.0507 -0.0080 0.0180 0.0036 -0.0035 0.0025
1963-09-30 -0.0157 -0.0052 0.0013 -0.0071 0.0029 0.0027
1963-10-31 0.0253 -0.0139 -0.0010 0.0280 -0.0201 0.0029
1963-11-30 -0.0085 -0.0088 0.0175 -0.0051 0.0224 0.0027
... ... ... ... ... ... ...
2022-07-31 0.0957 0.0187 -0.0410 0.0068 -0.0694 0.0008
2022-08-31 -0.0377 0.0151 0.0031 -0.0480 0.0130 0.0019
2022-09-30 -0.0935 -0.0100 0.0003 -0.0150 -0.0085 0.0019
2022-10-31 0.0783 0.0187 0.0805 0.0307 0.0656 0.0023
2022-11-30 0.0460 -0.0267 0.0139 0.0602 0.0312 0.0029

713 rows × 6 columns
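The SettingWithCopyWarning printed by load comes from assigning to a slice of a DataFrame. The line quoted in the warning suggests that preprocessing converts percentages to decimals by multiplying by 0.01. Below is a minimal sketch of the same conversion on an explicit copy, which avoids the warning. The raw layout (string percentages plus a stray "note" column) is an assumption made for illustration, not the actual file format.

```python
import pandas as pd

# Hypothetical raw data: factor values stored as percentage strings.
raw = pd.DataFrame(
    {
        "Mkt-RF": ["-0.39", "5.07"],
        "RF": ["0.27", "0.25"],
        "note": ["a", "b"],  # hypothetical column that is not a factor
    },
    index=pd.to_datetime(["1963-07-31", "1963-08-31"]),
)

# Take an explicit copy of the columns of interest so that later
# assignments modify an independent DataFrame, not a slice of raw.
factors = raw[["Mkt-RF", "RF"]].copy()

# Convert percentage strings to decimal floats.
for col in factors.columns:
    factors[col] = pd.to_numeric(factors[col]) * 0.01

print(factors)
```

Calling .copy() before the loop is the usual way to tell pandas that the new frame is meant to be independent of the original.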

These factors will be used for the three-factor and five-factor Fama-French models later. For now, let's look at a few more things you can do with the FamaFrenchFactors class.

Some More Functions

Changing the Frequency

You can change the frequency of the factors using the change_frequency function. It takes just one parameter:

frequency : str, optional
    The frequency of the data. The default is "D".
In [5]:
fff5_quarterly = fff.change_frequency(frequency="Q")
fff5_quarterly
Out[5]:
Mkt-RF SMB HML RMW CMA RF
1963-09-30 -0.0157 -0.0052 0.0013 -0.0071 0.0029 0.0027
1963-12-31 0.0183 -0.0210 -0.0002 0.0003 -0.0007 0.0029
1964-03-31 0.0141 0.0123 0.0340 -0.0221 0.0322 0.0031
1964-06-30 0.0127 0.0029 0.0062 -0.0028 -0.0017 0.0030
1964-09-30 0.0269 -0.0034 0.0170 -0.0056 0.0062 0.0028
... ... ... ... ... ... ...
2021-09-30 -0.0437 0.0114 0.0508 -0.0190 0.0214 0.0000
2021-12-31 0.0310 -0.0077 0.0328 0.0492 0.0443 0.0001
2022-03-31 0.0305 -0.0215 -0.0180 -0.0156 0.0317 0.0001
2022-06-30 -0.0843 0.0130 -0.0597 0.0185 -0.0470 0.0006
2022-09-30 -0.0935 -0.0100 0.0003 -0.0150 -0.0085 0.0019

237 rows × 6 columns

The function changes the frequency of the data in place. It is meant for downsampling. If you upsample the data (for example, from monthly to daily), the results will be misleading: the function uses ffill to fill the missing values, so every newly created day simply repeats the most recent monthly value.

Calculating Mean "Returns"

The Fama-French model needs the mean of each factor column to calculate the expected return of a stock. The class provides a function for this:

In [6]:
means = fff.calculate_mean_values()
In [7]:
means
Out[7]:
const     1.000000
Mkt-RF    0.002920
SMB       0.004762
HML       0.001381
RMW       0.003614
CMA       0.002217
RF        0.003654
dtype: float64

Note the extra entry named const. It is there because the Fama-French model has a constant term. Storing the means in this form makes it easy to calculate the expected return of a stock.
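To see why the const entry is convenient, here is a minimal sketch of the expected-return arithmetic. It assumes the usual factor-model form, where the expected excess return is the intercept plus each coefficient times the mean of its factor, and the risk-free rate is added back at the end. The numbers are illustrative only, and how pystock performs this step internally is not shown in this notebook.

```python
import pandas as pd

# Illustrative factor means (not real data); const is fixed at 1.
means = pd.Series(
    {"const": 1.0, "Mkt-RF": 0.006, "SMB": 0.002, "HML": 0.003, "RF": 0.001}
)

# Illustrative regression coefficients for a hypothetical stock.
params = pd.Series({"const": 0.001, "Mkt-RF": 1.2, "SMB": -0.2, "HML": -0.5})

# Because means["const"] == 1, the intercept passes straight through
# the dot product and no special case is needed for it.
expected_excess = params.dot(means[params.index])

# Add the mean risk-free rate back to get the expected return.
expected_return = expected_excess + means["RF"]

print(expected_excess, expected_return)
```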

The Stock class

Creating a Stock object

Start by creating a Stock object. Pass the name of the stock and the path to its data file:

In [3]:
apple = Stock("AAPL", "Data/AAPL.csv")
apple
Out[3]:
Stock(name=AAPL)

Let's see what the Stock object has:

In [4]:
apple.__dict__
Out[4]:
{'name': 'AAPL',
 'directory': 'Data/AAPL.csv',
 'loaded': False,
 'return_': {},
 'fff': <pystock.FFF.FamaFrenchFactors at 0x7f173634e850>}

return_ is a dictionary that will hold the returns of the stock. Each value is either a float (the mean return) or a pd.Series of floats (the return for each period).

fff is a reference to a FamaFrenchFactors object; we will see later what it is used for. loaded is a boolean that is True if the stock data has been loaded and False otherwise. Let's load the data. The load_data function takes a number of parameters:

start_date : str, optional
    Start date of the data, by default None
end_date : str, optional
    End date of the data, by default None
columns : list, optional
    Columns to keep, by default None which means keep all columns
frequency : str, optional
    Frequency of the data, by default "D"
rename_cols : list, optional
    Columns to rename, by default None

The function returns a pd.DataFrame with the data. Let's see what the data looks like:

load_data function

In [5]:
start_date = "2010-01-01"
end_date = "2022-12-20"
frequency = "D"
apple.load_data(start_date=start_date, end_date=end_date, frequency=frequency)
apple.__dict__.keys()
Out[5]:
dict_keys(['name', 'directory', 'loaded', 'return_', 'fff', 'data', 'columns', 'start_date', 'end_date', 'frequency'])

The Stock object now has some more attributes. data is a pd.DataFrame with the data. start_date and end_date are the start and end dates of the data. columns is a list of the columns of the data. frequency is the frequency of the data.

In [6]:
apple.data.head()
Out[6]:
Open High Low Close Adj Close Volume
Date
2010-01-04 7.622500 7.660714 7.585000 7.643214 6.515212 493729600
2010-01-05 7.664286 7.699643 7.616071 7.656429 6.526478 601904800
2010-01-06 7.656429 7.686786 7.526786 7.534643 6.422666 552160000
2010-01-07 7.562500 7.571429 7.466071 7.520714 6.410790 477131200
2010-01-08 7.510714 7.571429 7.466429 7.570714 6.453411 447610800
In [7]:
apple.loaded
Out[7]:
True

As you can see, loaded is now equal to True.

Working With Returns

Next, we'll calculate various returns using the object. The freq_return function does this and takes the following parameters:

frequency : str, optional
    Frequency of the data, by default "M"
mean : bool, optional
    Whether to return the mean of the return, by default True
column : str, optional
    Column to calculate the return, by default "Close"
In [31]:
daily_return_series = apple.freq_return(frequency="D", mean=False)
daily_return_avg = apple.freq_return(frequency="D", mean=True)
display(daily_return_series.head())
display(daily_return_avg)
Date
2010-01-05    0.001729
2010-01-06   -0.015906
2010-01-07   -0.001849
2010-01-08    0.006648
2010-01-09    0.000000
Freq: D, Name: Close, dtype: float64
0.0007156125449657148
In [32]:
monthly_return_series = apple.freq_return(frequency="M", mean=False)
monthly_return_avg = apple.freq_return(frequency="M", mean=True)
display(monthly_return_series.head())
display(monthly_return_avg)
Date
2010-02-28    0.065396
2010-03-31    0.148470
2010-04-30    0.111021
2010-05-31   -0.016125
2010-06-30   -0.020827
Freq: M, Name: Close, dtype: float64
0.02320634467521188

These returns are saved in the return_ attribute of the object. The keys of the return_ dictionary are the frequencies of the returns. Since each frequency has a single key, only the most recent result is kept; here that is the mean return, because it was calculated last.

In [33]:
apple.return_
Out[33]:
{'D': 0.0007156125449657148, 'M': 0.02320634467521188}
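The implementation of freq_return is not shown here, but the daily figures above are consistent with a simple percentage change of the Close column. The sketch below assumes that definition and uses the first five Close prices from apple.data.head().

```python
import pandas as pd

# First five daily Close prices from the AAPL data shown earlier.
close = pd.Series(
    [7.643214, 7.656429, 7.534643, 7.520714, 7.570714],
    index=pd.to_datetime(
        ["2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07", "2010-01-08"]
    ),
    name="Close",
)

# Simple return: (price_t / price_{t-1}) - 1. The first row has no
# previous price, so it is dropped.
daily_returns = close.pct_change().dropna()

# Mean of the per-period returns, analogous to mean=True.
mean_return = daily_returns.mean()

print(daily_returns.round(6))
print(mean_return)
```

The rounded values match the first four entries of daily_return_series shown above.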

Changing the frequency of the data

In [34]:
apple.frequency
Out[34]:
'D'
In [35]:
apple.data.head()
Out[35]:
Open High Low Close Adj Close Volume
Date
2010-01-04 7.622500 7.660714 7.585000 7.643214 6.515212 493729600
2010-01-05 7.664286 7.699643 7.616071 7.656429 6.526478 601904800
2010-01-06 7.656429 7.686786 7.526786 7.534643 6.422666 552160000
2010-01-07 7.562500 7.571429 7.466071 7.520714 6.410790 477131200
2010-01-08 7.510714 7.571429 7.466429 7.570714 6.453411 447610800

The data was loaded with a daily frequency. Suppose you want to change it to some other frequency. This can be done with the change_frequency function. It takes just one parameter:

frequency : str
        Frequency of the data
In [36]:
apple.change_frequency("M")
In [37]:
apple.frequency
Out[37]:
'M'
In [38]:
apple.data.head()
Out[38]:
Open High Low Close Adj Close Volume
Date
2010-01-31 7.181429 7.221429 6.794643 6.859286 5.846978 1245952400
2010-02-28 7.227857 7.327500 7.214286 7.307857 6.229348 507460800
2010-03-31 8.410357 8.450357 8.373571 8.392857 7.154222 430659600
2010-04-30 9.618214 9.663214 9.321429 9.324643 7.948493 542463600
2010-05-31 9.263929 9.264286 9.048214 9.174286 7.820327 815614800

The function changes the frequency of the data in place. It is meant for downsampling. If you upsample the data (for example, from monthly to daily), the results will be misleading: the function uses ffill to fill the missing values, so every newly created day simply repeats the most recent monthly value.
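The effect of upsampling with ffill can be reproduced with plain pandas. This is a minimal sketch on made-up monthly values, assuming the frequency change is done with DataFrame.asfreq and forward filling.

```python
import pandas as pd

# Three made-up month-end observations.
monthly = pd.DataFrame(
    {"Close": [1.0, 2.0, 3.0]},
    index=pd.to_datetime(["2022-01-31", "2022-02-28", "2022-03-31"]),
)

# Upsample to daily frequency, forward-filling the gaps.
daily = monthly.asfreq("D", method="ffill")

# Many more rows, but no new information: only the three original
# values appear, each repeated until the next month-end observation.
print(len(daily), daily["Close"].nunique())
```

Returns computed on such data would be zero on almost every day, which is why upsampling this way is not useful.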

Stock object with FamaFrenchFactors

The fff attribute of a Stock object is a reference to a FamaFrenchFactors object. This object is used to get the Fama-French factors. See the section on the FamaFrenchFactors class for more details. Here, we'll give a brief overview of how to use it.

download_fff function

This is a wrapper function for FamaFrenchFactors.download. It takes the same parameters as FamaFrenchFactors.download, along with some others such as load, and returns the same thing. It is used to download the data from the Fama-French website. Again, see the section on the FamaFrenchFactors class for more details.

The load parameter controls whether the data is also loaded into the FamaFrenchFactors object. If load is True, the data is loaded into the fff attribute of the Stock object. If load is False, the data is only downloaded, not loaded. This is useful if you want to download the data now and load it later.

In [21]:
fff_data = apple.download_fff(frequency="D", factors=5, directory="Data", load=True)
Downloading Fama French Factors. This may take about 10 seconds.
Download complete. File saved as Data/fff_daily_5_factors.csv
Use load() to load the file as a pandas dataframe.

Since we have used load=True, the data is loaded into the fff attribute of the Stock object.

In [23]:
apple.fff.data
Out[23]:
Mkt-RF SMB HML RMW CMA RF
1963-07-01 -0.0067 0.0002 -0.0035 0.0003 0.0013 0.00012
1963-07-02 0.0079 -0.0028 0.0028 -0.0008 -0.0021 0.00012
1963-07-03 0.0063 -0.0018 -0.0010 0.0013 -0.0025 0.00012
1963-07-05 0.0040 0.0009 -0.0028 0.0007 -0.0030 0.00012
1963-07-08 -0.0063 0.0007 -0.0020 -0.0027 0.0006 0.00012
... ... ... ... ... ... ...
2022-11-23 0.0063 -0.0025 -0.0094 -0.0073 -0.0057 0.00014
2022-11-25 -0.0002 0.0027 0.0044 -0.0016 0.0014 0.00014
2022-11-28 -0.0155 -0.0047 -0.0020 0.0032 0.0031 0.00014
2022-11-29 -0.0018 0.0035 0.0103 0.0019 0.0047 0.00014
2022-11-30 0.0312 -0.0014 -0.0207 -0.0078 -0.0142 0.00014

14958 rows × 6 columns

load_fff function

This is a wrapper function for FamaFrenchFactors.load. It takes the same parameters as FamaFrenchFactors.load and returns the same thing. It is used to load the data from a local file, if one exists.

In [9]:
apple.load_fff(frequency="D", factors=5, directory="Data")
Out[9]:
Mkt-RF SMB HML RMW CMA RF
1963-07-01 -0.0067 0.0002 -0.0035 0.0003 0.0013 0.00012
1963-07-02 0.0079 -0.0028 0.0028 -0.0008 -0.0021 0.00012
1963-07-03 0.0063 -0.0018 -0.0010 0.0013 -0.0025 0.00012
1963-07-05 0.0040 0.0009 -0.0028 0.0007 -0.0030 0.00012
1963-07-08 -0.0063 0.0007 -0.0020 -0.0027 0.0006 0.00012
... ... ... ... ... ... ...
2022-11-23 0.0063 -0.0025 -0.0094 -0.0073 -0.0057 0.00014
2022-11-25 -0.0002 0.0027 0.0044 -0.0016 0.0014 0.00014
2022-11-28 -0.0155 -0.0047 -0.0020 0.0032 0.0031 0.00014
2022-11-29 -0.0018 0.0035 0.0103 0.0019 0.0047 0.00014
2022-11-30 0.0312 -0.0014 -0.0207 -0.0078 -0.0142 0.00014

14958 rows × 6 columns

Calculating Fama-French Factors

The factors can be calculated using the calculate_fff function. It takes the following parameters:

column : str, optional
    Column to calculate the fama french factors on, by default "Close"
verbose : int, optional
    Verbosity, by default 1

The function raises an error if either the Stock or the FamaFrenchFactors object is not loaded.

In [10]:
params = apple.calculate_fff(column = "Close")
Fama French Factors Calculated
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.534
Model:                            OLS   Adj. R-squared:                  0.534
Method:                 Least Squares   F-statistic:                     744.8
Date:                Sun, 01 Jan 2023   Prob (F-statistic):               0.00
Time:                        19:20:12   Log-Likelihood:                 9671.0
No. Observations:                3250   AIC:                        -1.933e+04
Df Residuals:                    3244   BIC:                        -1.929e+04
Df Model:                           5                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0004      0.000      1.703      0.089   -5.59e-05       0.001
Mkt-RF         1.1768      0.021     56.919      0.000       1.136       1.217
SMB           -0.1769      0.039     -4.517      0.000      -0.254      -0.100
HML           -0.4995      0.037    -13.372      0.000      -0.573      -0.426
RMW            0.5945      0.052     11.371      0.000       0.492       0.697
CMA           -0.0151      0.072     -0.209      0.834      -0.157       0.127
==============================================================================
Omnibus:                      484.352   Durbin-Watson:                   1.922
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             7608.523
Skew:                          -0.012   Prob(JB):                         0.00
Kurtosis:                      10.496   Cond. No.                         355.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
In [11]:
params
Out[11]:
const     0.000370
Mkt-RF    1.176796
SMB      -0.176945
HML      -0.499524
RMW       0.594542
CMA      -0.015124
dtype: float64
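The summary above is an ordinary least squares fit of the stock's return on a constant and the factor columns. To illustrate the idea without the library, the sketch below fits the same kind of regression on synthetic data with numpy's least-squares solver. This is a stand-in technique and not the code used by calculate_fff; all names and numbers are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic daily factor values for three hypothetical factors.
factors = rng.normal(0.0, 0.01, size=(n, 3))

# "True" intercept and factor loadings used to generate the data.
true_params = np.array([0.0005, 1.2, -0.2, -0.5])

# Design matrix: a column of ones (the const term) plus the factors.
X = np.column_stack([np.ones(n), factors])

# Synthetic excess returns: linear in the factors plus small noise.
noise = rng.normal(0.0, 0.001, size=n)
excess_return = X @ true_params + noise

# Ordinary least squares estimate of [const, beta_1, beta_2, beta_3].
estimated, *_ = np.linalg.lstsq(X, excess_return, rcond=None)

print(estimated)
```

With little noise, the estimated coefficients land close to the values used to generate the data, which is the same sense in which params above estimates the stock's factor loadings.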

Portfolio class

The class represents a portfolio which has a list of stocks and a benchmark. You can also provide a weight for each stock.

Getting Started with Portfolio

To start, you must provide at least the directory of the benchmark data and its name. You can also provide a list of stock names and their directories, as well as a weight for each stock. If the weights are not provided, equal weighting is used: each weight is 1/n, where n is the number of stocks.

def __init__(self, benchmark_dir, benchmark_name, stocks_dir=None, stocks_name=None, weights=None):
In [2]:
benchmark_name = "S&P"
benchmark_dir = "Data/GSPC.csv"

portfolio = Portfolio(benchmark_dir=benchmark_dir, benchmark_name=benchmark_name)
portfolio
Out[2]:
Portfolio(S&P,[])
In [3]:
len(portfolio)
Out[3]:
1

The representation of the portfolio shows the name of the benchmark and the stocks in the portfolio. The length of the portfolio is the number of stocks in it, including the benchmark.

Customizing the Portfolio

In [4]:
portfolio.benchmark.loaded
Out[4]:
False

Right now, portfolio has just one unloaded benchmark and no stocks. Let's load the benchmark and add a stock.

Loading The Benchmark

This can be done by using the load_benchmark function. It takes the following parameters:

start_date : str, optional
    Start date of the data, by default None
end_date : str, optional
    End date of the data, by default None
columns : list, optional
    Columns to keep, by default None which means keep all columns
frequency : str, optional
    Frequency of the data, by default "D"
rename_cols : list, optional
    Columns to rename, by default None
In [5]:
start_date = "2012-01-01"
end_date = "2022-12-20"
frequency = "D"
portfolio.load_benchmark(start_date=start_date, end_date=end_date, frequency=frequency)
In [6]:
portfolio.benchmark.loaded
Out[6]:
True
In [7]:
portfolio.benchmark.data.head()
Out[7]:
Open High Low Close Adj Close Volume
Date
2012-01-03 1258.859985 1284.619995 1258.859985 1277.060059 1277.060059 3943710000
2012-01-04 1277.030029 1278.729980 1268.099976 1277.300049 1277.300049 3592580000
2012-01-05 1277.300049 1283.050049 1265.260010 1281.060059 1281.060059 4315950000
2012-01-06 1280.930054 1281.839966 1273.339966 1277.810059 1277.810059 3656830000
2012-01-07 1280.930054 1281.839966 1273.339966 1277.810059 1277.810059 3656830000

Alternatively, you can use the Stock.load_data function to load the benchmark data since benchmark is just a Stock object.

Changing The Benchmark

You can also change the benchmark by using the change_benchmark function. It takes the following parameters:

benchmark_dir : str
    Directory of the benchmark
benchmark_name : str
    Name of the benchmark
load : bool, optional
    Load the data, by default True
use_prev : bool, optional
    Use the values of start_date, end_date, columns, frequency, rename_cols from the previous benchmark, by default True
start_date : str, optional
    Start date, by default None
end_date : str, optional
    End date, by default None
columns : list, optional
    Columns to keep, by default None
frequency : str, optional
    Frequency of the data, by default "D"
rename_cols : list, optional
    Columns to rename, by default None
In [6]:
dji_name = "Dow_Jones"
dji_dir = "Data/DJI.csv"

portfolio.change_benchmark(benchmark_dir=dji_dir, benchmark_name=dji_name, load=True, use_prev=False)
In [7]:
portfolio
Out[7]:
Portfolio(Dow_Jones,[])
In [8]:
portfolio.benchmark.loaded
Out[8]:
True

Adding A Stock

Quick Way

The class provides a function add_stocks to add a stock. It takes the following parameters:

stock_dirs : list
    List of stock directories
stock_names : list, optional
    List of stock names, by default None
load_data : bool, optional
    Whether to load the data, by default True
start_date : str, optional
    Start date, by default None
end_date : str, optional
    End date, by default None
columns : list, optional
    Columns to keep, by default None
frequency : str, optional
    Frequency of the data, by default "D"
rename_cols : list, optional
    Columns to rename, by default None
overwrite : bool, optional
    Whether to overwrite existing stocks, by default False

The quickest way to add one or more stocks is to pass the stock_dirs and stock_names parameters. Let's see this in action:

In [9]:
stock_names = ["AAPL"]
stock_dirs = ["Data/AAPL.csv"]

portfolio.add_stocks(stock_dirs = stock_dirs, stock_names = stock_names, load_data=False, frequency=frequency, start_date=start_date, end_date=end_date)

To add a single stock, pass its name and directory inside lists, as done here.

In [12]:
portfolio
Out[12]:
Portfolio(Dow_Jones,['AAPL'])

Using Stock Object

Another way is to first create the Stock object and then add it using the same method.

In [10]:
google = Stock("GOOG", "Data/GOOG.csv")
portfolio.add_stocks(stocks=[google], load_data=False, frequency=frequency, start_date=start_date, end_date=end_date)
In [14]:
portfolio
Out[14]:
Portfolio(Dow_Jones,['AAPL', 'GOOG'])

Now, our portfolio has one benchmark and two stocks.

In [15]:
portfolio.weights
Out[15]:
array([0.5, 0.5])

You can see that the weights have been adjusted.

Adding Multiple Stocks

For this, just pass a list of stock directories and names. The weights will be adjusted accordingly. Or you can pass a list of Stock objects.

In [11]:
stock_names = ["TSLA", "MSFT"]
stock_dirs = ["Data/TSLA.csv", "Data/MSFT.csv"]

portfolio.add_stocks(stock_dirs = stock_dirs, stock_names = stock_names, load_data=False, frequency=frequency, start_date=start_date, end_date=end_date)
In [12]:
portfolio
Out[12]:
Portfolio(Dow_Jones,['AAPL', 'GOOG', 'TSLA', 'MSFT'])

Another thing to note is that two Stocks are considered equal if they have the same name. You cannot have two stocks with the same name in the portfolio. If you try to add a stock with the same name as an existing stock, the existing stock is overwritten or the command is ignored, depending on the value of the overwrite parameter.

In [13]:
google = Stock("GOOG", "Data/GOOG.csv")

portfolio.add_stocks(stocks=[google], load_data=False, frequency=frequency, start_date=start_date, end_date=end_date, overwrite=False)
Stock GOOG already exists
You have not specified overwrite=True. Skipping...
In [14]:
portfolio
Out[14]:
Portfolio(Dow_Jones,['AAPL', 'GOOG', 'TSLA', 'MSFT'])
In [15]:
google = Stock("GOOG", "Data/GOOG.csv")

portfolio.add_stocks(stocks=[google], load_data=False, frequency=frequency, start_date=start_date, end_date=end_date, overwrite=True)
Stock GOOG already exists
Overwriting...
In [16]:
portfolio
Out[16]:
Portfolio(Dow_Jones,['AAPL', 'TSLA', 'MSFT', 'GOOG'])
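The behaviour seen above (equality by name, skip or replace depending on overwrite, and the replaced stock moving to the end of the list) can be sketched with a small stand-alone example. MiniStock and add_stock are hypothetical names invented for this illustration; they are not part of pystock, and the real implementation may differ.

```python
from dataclasses import dataclass, field


@dataclass
class MiniStock:
    """Hypothetical stand-in for Stock; equality uses the name only."""

    name: str
    directory: str = field(compare=False)  # ignored when comparing


def add_stock(stocks, new_stock, overwrite=False):
    """Return a new list with new_stock added, honouring overwrite."""
    if new_stock in stocks:
        if not overwrite:
            # Same name already present: keep the existing stock.
            return stocks
        # Drop the existing stock; the new one is appended below.
        stocks = [s for s in stocks if s != new_stock]
    return stocks + [new_stock]


stocks = [
    MiniStock("AAPL", "Data/AAPL.csv"),
    MiniStock("GOOG", "Data/GOOG.csv"),
    MiniStock("TSLA", "Data/TSLA.csv"),
]

new_google = MiniStock("GOOG", "Data/GOOG_new.csv")
skipped = add_stock(stocks, new_google)
replaced = add_stock(stocks, new_google, overwrite=True)

print([s.name for s in skipped])
print([s.name for s in replaced])
```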

Removing Stocks

To remove a Stock from the Portfolio, use the remove_stocks function. It takes the following parameters:

names : list
    A list of the names of the stocks to remove
In [17]:
portfolio.remove_stocks(["GOOG"])
In [18]:
portfolio
Out[18]:
Portfolio(Dow_Jones,['AAPL', 'TSLA', 'MSFT'])

Change the Frequency of the Data

To change the frequency of the portfolio, use the change_benchmark_frequency function. It takes the following parameters:

frequency : str
    Frequency of the data
change_stocks : bool, optional
    Whether to change the frequency of the stock data, by default True
In [19]:
portfolio.benchmark.frequency
Out[19]:
'D'

However, you can change the frequency only if the data has been loaded. If it has not, the function raises an error.

In [21]:
portfolio.change_benchmark_frequency("M")
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[21], line 1
----> 1 portfolio.change_benchmark_frequency("M")

File /media/hari31416/Hari_SSD/Users/harik/Desktop/Finance/pystock_project/pystock/portfolio.py:769, in Portfolio.change_benchmark_frequency(self, frequency, change_stocks)
    767 if change_stocks:
    768     for stock in self.stocks:
--> 769         stock.change_frequency(frequency)

File /media/hari31416/Hari_SSD/Users/harik/Desktop/Finance/pystock_project/pystock/portfolio.py:178, in Stock.change_frequency(self, frequency)
    169 """
    170 Changes the frequency of the data.
    171 
   (...)
    175     Frequency of the data
    176 """
    177 self.frequency = frequency
--> 178 self.data = self.data.asfreq(frequency, "ffill").dropna()

AttributeError: 'Stock' object has no attribute 'data'

Loading data will be covered in the next section. For now, as benchmark is already loaded, we will change the frequency of the benchmark.

In [23]:
portfolio.change_benchmark_frequency("M", change_stocks=False)
In [24]:
portfolio.benchmark.frequency
Out[24]:
'M'

Although you can get away with changing the frequency of the benchmark only, it is recommended to change the frequency of the stock data as well.

Loading the Data

Often, when you run a function, you will get an exception saying "'Stock' object has no attribute 'data'". This happens because the Stock has not been loaded yet, which you can check using the loaded attribute of the Stock object.

In [30]:
for stock, name in portfolio:
    print(name, stock.loaded)
Dow_Jones True
AAPL False
TSLA False
MSFT False

We see that no stock data is loaded. Let's load the data.

You can use the Portfolio as an iterator. Some more details about these special methods will be covered later.
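Since iterating yields (stock, name) pairs and each stock has a loaded attribute, a one-line check can list everything that still needs loading before you call a function that depends on the data. The sketch below uses simple stand-in objects instead of a real Portfolio so that it runs on its own; with a real portfolio you would iterate over portfolio directly.

```python
from types import SimpleNamespace

# Stand-ins for the (stock, name) pairs that iterating a Portfolio yields.
portfolio_items = [
    (SimpleNamespace(name="Dow_Jones", loaded=True), "Dow_Jones"),
    (SimpleNamespace(name="AAPL", loaded=False), "AAPL"),
    (SimpleNamespace(name="TSLA", loaded=False), "TSLA"),
    (SimpleNamespace(name="MSFT", loaded=False), "MSFT"),
]

# Collect the names of everything that has not been loaded yet.
not_loaded = [name for stock, name in portfolio_items if not stock.loaded]

if not_loaded:
    print("Load these before continuing:", not_loaded)
```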

There are three main functions for loading data. We have already discussed the load_benchmark function. The other two are discussed below.

load_one_stock

As the name suggests, this loads the data of one stock, specified by the name parameter. The function is built on the Stock.load_data function. It takes the following parameters:

name : str
    Name of the stock
start_date : str, optional
    Start date, by default None
end_date : str, optional
    End date, by default None
columns : list, optional
    Columns to keep, by default None
frequency : str, optional
    Frequency of the data, by default "D"
rename_cols : list, optional
    Columns to rename, by default None
overwrite : bool, optional
    Whether to overwrite existing data, by default False
In [31]:
apple_data = portfolio.load_one_stock("AAPL", frequency=frequency, start_date=start_date, end_date=end_date)
In [32]:
for stock, name in portfolio:
    print(name, stock.loaded)
Dow_Jones True
AAPL True
TSLA False
MSFT False

The data of AAPL is now loaded. We get some more attributes by loading the data. See the Stock class for more details.

load_all

As the name suggests, this loads data of all the stocks in the portfolio. It takes the following parameters:

start_date : str, optional
    Start date, by default None
end_date : str, optional
    End date, by default None
columns : list, optional
    Columns to keep, by default None
frequency : str, optional
    Frequency of the data, by default "D"
rename_cols : list, optional
    Columns to rename, by default None
overwrite : bool, optional
    Whether to overwrite existing data, by default False

Previously, we just loaded the apple data, now we'll load all the data.

In [33]:
portfolio.load_all(frequency=frequency, start_date=start_date, end_date=end_date)
In [34]:
for stock, name in portfolio:
    print(name, stock.loaded)
Dow_Jones True
AAPL True
TSLA True
MSFT True

Let's see the data of these stocks.

In [36]:
portfolio["AAPL"].data.head()
Out[36]:
Open High Low Close Adj Close Volume
Date
2012-01-03 14.621429 14.732143 14.607143 14.686786 12.519279 302220800
2012-01-04 14.642857 14.810000 14.617143 14.765714 12.586560 260022000
2012-01-05 14.819643 14.948214 14.738214 14.929643 12.726295 271269600
2012-01-06 14.991786 15.098214 14.972143 15.085714 12.859330 318292800
2012-01-07 14.991786 15.098214 14.972143 15.085714 12.859330 318292800
In [37]:
portfolio["TSLA"].data.head()
Out[37]:
Open High Low Close Adj Close Volume
Date
2012-01-03 1.929333 1.966667 1.843333 1.872000 1.872000 13921500
2012-01-04 1.880667 1.911333 1.833333 1.847333 1.847333 9451500
2012-01-05 1.850667 1.862000 1.790000 1.808000 1.808000 15082500
2012-01-06 1.813333 1.852667 1.760667 1.794000 1.794000 14794500
2012-01-07 1.813333 1.852667 1.760667 1.794000 1.794000 14794500

Special Methods

Representation

The Portfolio class implements the __repr__ method, which gives the object a readable representation.

In [93]:
portfolio
Out[93]:
Portfolio(Dow_Jones,['AAPL', 'TSLA', 'MSFT'])

You can see that the representation of Portfolio has the name of the benchmark and the list of the stocks. This lets us have a "peek" at the portfolio!

String

You can "print" the Portfolio and it will give a peek of the portfolio:

In [95]:
print(portfolio)
Portfolio with benchmark Dow_Jones and stocks ['AAPL', 'TSLA', 'MSFT']

Using the in Keyword

The Portfolio class implements the __contains__ special method. This makes it easy to use the in keyword to check if a Stock is in the Portfolio. Use the stock name or the Stock object.

In [89]:
"AAPL" in portfolio, "TCS" in portfolio
Out[89]:
(True, False)
In [90]:
portfolio.stocks[0] in portfolio
Out[90]:
True
In [91]:
portfolio.benchmark in portfolio
Out[91]:
True

Using Subscription

You can use the name of the stock to get the Stock from the Portfolio object:

In [92]:
portfolio["AAPL"]
Out[92]:
Stock(name=AAPL)
In [94]:
portfolio["Dow_Jones"]
Out[94]:
Stock(name=Dow_Jones)

Iteration

You can iterate over the Portfolio.

In [97]:
for stock, name in portfolio:
    print(stock.name, name)
Dow_Jones Dow_Jones
AAPL AAPL
TSLA TSLA
MSFT MSFT

The Portfolio iterator yields the Stock and name of the stock. Note that the first entry is that of the benchmark.

You can use the list constructor to create a list of stock and names:

In [98]:
list(portfolio)
Out[98]:
[(Stock(name=Dow_Jones), 'Dow_Jones'),
 (Stock(name=AAPL), 'AAPL'),
 (Stock(name=TSLA), 'TSLA'),
 (Stock(name=MSFT), 'MSFT')]
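The special methods covered in this section map onto standard Python protocols. As a rough illustration, here is a toy container that supports repr, len, in, subscription and iteration. MiniPortfolio is a hypothetical class that stores plain strings instead of Stock objects; it is not how Portfolio is actually implemented.

```python
class MiniPortfolio:
    """Toy container: one benchmark name plus a list of stock names."""

    def __init__(self, benchmark, stocks):
        self.benchmark = benchmark
        self.stocks = list(stocks)

    def _names(self):
        # The benchmark always comes first, followed by the stocks.
        return [self.benchmark] + self.stocks

    def __repr__(self):
        return f"MiniPortfolio({self.benchmark},{self.stocks})"

    def __len__(self):
        # Counts the benchmark as well as the stocks.
        return len(self._names())

    def __contains__(self, name):
        return name in self._names()

    def __getitem__(self, name):
        if name not in self:
            raise KeyError(name)
        return name

    def __iter__(self):
        return iter(self._names())


mini = MiniPortfolio("Dow_Jones", ["AAPL", "TSLA", "MSFT"])
print(mini, len(mini), "AAPL" in mini, "TCS" in mini)
print(list(mini))
```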

Merging

Merging is necessary for calculating various stock parameters used in the portfolio optimization models. For this reason, we have a couple of methods.

Merging With the Benchmark

This is necessary for calculating the $\alpha$ and $\beta$ parameters. It is done with the merge_stock_with_benchmark function.

In [38]:
merged = portfolio.merge_stock_with_benchmark("AAPL")
In [40]:
merged.head()
Out[40]:
Dow_Jones_Open Dow_Jones_High Dow_Jones_Low Dow_Jones_Close Dow_Jones_Adj Close Dow_Jones_Volume AAPL_Open AAPL_High AAPL_Low AAPL_Close AAPL_Adj Close AAPL_Volume
Date
2012-01-31 12632.900391 12632.900391 12632.900391 12632.900391 12632.900391 0 16.271070 16.365713 16.181070 16.302856 13.896848 391683600
2012-02-29 12952.099609 12952.099609 12952.099609 12952.099609 12952.099609 0 19.341429 19.557501 19.132143 19.372856 16.513771 952011200
2012-03-31 13212.000000 13212.000000 13212.000000 13212.000000 13212.000000 0 21.741785 21.805714 21.355000 21.412500 18.252399 731038000
2012-04-30 13213.599609 13213.599609 13213.599609 13213.599609 13213.599609 0 21.350000 21.371429 20.821428 20.856428 17.778395 506144800
2012-05-31 12393.500000 12393.500000 12393.500000 12393.500000 12393.500000 0 20.740713 20.767857 20.409286 20.633215 17.588120 491674400

When merging, it is recommended that you keep only the columns that will be required later. Usually "Close" is the only useful column, so it is a good idea to load just this column when calling the load_all method.

Merge Everything

Use the merge_all function for this. This merges all the stocks with benchmark. Note that all stocks must be loaded.

In [41]:
merged_all = portfolio.merge_all()
In [42]:
merged_all.columns
Out[42]:
Index(['Dow_Jones_Open', 'Dow_Jones_High', 'Dow_Jones_Low', 'Dow_Jones_Close',
       'Dow_Jones_Adj Close', 'Dow_Jones_Volume', 'AAPL_Open', 'AAPL_High',
       'AAPL_Low', 'AAPL_Close', 'AAPL_Adj Close', 'AAPL_Volume', 'TSLA_Open',
       'TSLA_High', 'TSLA_Low', 'TSLA_Close', 'TSLA_Adj Close', 'TSLA_Volume',
       'MSFT_Open', 'MSFT_High', 'MSFT_Low', 'MSFT_Close', 'MSFT_Adj Close',
       'MSFT_Volume'],
      dtype='object')

Since we used all the columns while loading, merge_all produces a large number of columns.
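The column names in the merged frames follow a name_column pattern. A similar result can be produced in plain pandas by prefixing each frame's columns and joining on the date index. The sketch below assumes an inner join on dates, which may not be exactly what the library does, and uses a few rounded values for illustration.

```python
import pandas as pd

dates = pd.to_datetime(["2012-01-31", "2012-02-29", "2012-03-31"])

# Keep only the Close column, as recommended above.
benchmark = pd.DataFrame({"Close": [12632.9, 12952.1, 13212.0]}, index=dates)

# The stock is missing the last date on purpose.
stock = pd.DataFrame({"Close": [16.30, 19.37]}, index=dates[:2])

# Prefix the columns with the owner's name, then join on the index.
# An inner join keeps only the dates present in both frames.
merged = benchmark.add_prefix("Dow_Jones_").join(
    stock.add_prefix("AAPL_"), how="inner"
)

print(merged)
```

With a single column per frame, the merged result has one column per stock plus one for the benchmark, which is much easier to work with than the 24 columns shown above.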

Returning the Return

The return of a stock is one of its most important features. The Portfolio class provides a number of ways to get it.

Of course, you can get the return by calling the methods built into the Stock object. Here, we'll discuss the methods of the Portfolio object.

The return methods discussed here, as well as most of the methods discussed below, take a column parameter that dictates which column to use when calculating the corresponding values. The default is "Close" and you should not change it. An exception is when you want to use "Adj Close". Even in that case, however, it is recommended that you rename the column from "Adj Close" to "Close" while loading the data.

Return of A Single Stock

This can be determined using the get_stock_return method. As usual, pass the name of the stock. The method also takes a frequency parameter. It returns the mean return and its standard deviation.

In [62]:
apple_return, apple_std = portfolio.get_stock_return("AAPL")
In [65]:
apple_return, apple_std
Out[65]:
(0.02034460996923982, 0.08107760447739829)

The methods of the Portfolio object are implemented to give an average return. If you want a series of returns, use the methods of the Stock object.
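
To make the "average return" concrete, here is a minimal NumPy sketch that turns a price series into period returns and then into a mean and a standard deviation. It assumes simple percentage returns and a sample standard deviation. Whether pystock uses exactly these definitions is an assumption, and the five prices are just the AAPL Close values from the merged table shown earlier.

```python
import numpy as np

# Monthly AAPL closing prices copied from the merged table shown earlier.
close = np.array([16.302856, 19.372856, 21.412500, 20.856428, 20.633215])

# Simple period-over-period return: p_t / p_{t-1} - 1.
returns = close[1:] / close[:-1] - 1

mean_return = returns.mean()
std_return = returns.std(ddof=1)  # sample standard deviation (an assumption)
print(mean_return, std_return)
```

The returns array is the "series of returns"; the two scalars are the kind of summary the Portfolio methods report.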

Return of All Stocks

Use the get_all_stock_returns function!

In [66]:
monthly_returns = portfolio.get_all_stock_returns()
monthly_returns
Out[66]:
Stock Monthly_Mean_Return Monthly_Return_STD
0 AAPL 0.020345 0.081078
1 TSLA 0.050261 0.182803
2 MSFT 0.018516 0.060729

What About the Whole Portfolio?

Well, you can use the portfolio_return method to get this. The method gives a weighted return. If you don't pass any weights, the stocks are weighted equally; you can also specify the weights yourself.

In [68]:
portfolio_return_equal, _ = portfolio.portfolio_return()
portfolio_return_equal
Out[68]:
0.029707288531672652
In [70]:
portfolio_return_just_apple, _ = portfolio.portfolio_return(weights=[1,0,0])
portfolio_return_just_apple
Out[70]:
0.02034460996923982
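
The two results above are consistent with a plain weighted average of the per-stock mean returns. The sketch below reproduces them from the rounded means in the returns table; weighted_return is a hypothetical helper written for this illustration, not part of pystock.

```python
import numpy as np

# Mean monthly returns of AAPL, TSLA and MSFT, taken from the table above.
mean_returns = np.array([0.020345, 0.050261, 0.018516])

def weighted_return(weights, mean_returns):
    """Return the weighted average of the per-stock mean returns."""
    weights = np.asarray(weights, dtype=float)
    return float(weights @ mean_returns)

equal = weighted_return([1 / 3, 1 / 3, 1 / 3], mean_returns)
just_apple = weighted_return([1, 0, 0], mean_returns)
print(equal, just_apple)
```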

Calculating the Stock Parameters

alpha and beta

These two parameters are required for the CAPM and SIM models. There are two methods for calculating them:

get_stock_params

This function returns the parameters for one stock identified by the name of the stock.

In [47]:
tesla_alpha, tesla_beta = portfolio.get_stock_params("TSLA")
In [48]:
print(tesla_alpha, tesla_beta)
0.04086664252883307 1.713736456801432

get_all_stock_params

This returns parameters for all the stocks in the portfolio.

In [54]:
alpha_beta_all = portfolio.get_all_stock_params(return_dict=False, column="Close")
In [55]:
alpha_beta_all
Out[55]:
Stock Alpha Beta
0 AAPL 0.013671 0.976056
1 TSLA 0.040867 1.713736
2 MSFT 0.013301 0.858644

After calling the get_all_stock_params method, the alpha and beta of a stock can also be accessed through the attributes of that stock.

In [59]:
for stock in portfolio.stocks:
    print(stock.name, stock.alpha, stock.beta)
AAPL 0.013670770282234219 0.9760561904155619
TSLA 0.04086664252883307 1.713736456801432
MSFT 0.013301247231778873 0.8586435486025358

The parameters can also be accessed directly from the Portfolio:

In [60]:
portfolio.alphas, portfolio.betas
Out[60]:
([0.013670770282234219, 0.04086664252883307, 0.013301247231778873],
 [0.9760561904155619, 1.713736456801432, 0.8586435486025358])
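
Conceptually, alpha and beta are the intercept and slope of a straight-line fit of a stock's returns against the benchmark's returns. The following sketch shows that fit with NumPy on made-up data. The alpha_beta helper and the return series are hypothetical, and whether pystock regresses raw or excess returns is an assumption left open here.

```python
import numpy as np

def alpha_beta(stock_returns, benchmark_returns):
    """Least-squares intercept (alpha) and slope (beta) of stock on benchmark returns."""
    stock_returns = np.asarray(stock_returns, dtype=float)
    benchmark_returns = np.asarray(benchmark_returns, dtype=float)
    # polyfit returns the highest-degree coefficient first: [slope, intercept].
    beta, alpha = np.polyfit(benchmark_returns, stock_returns, deg=1)
    return float(alpha), float(beta)

# Hypothetical returns constructed so that alpha = 0.01 and beta = 1.5 exactly.
benchmark = np.array([0.02, -0.01, 0.03, 0.00, 0.015])
stock = 0.01 + 1.5 * benchmark

alpha, beta = alpha_beta(stock, benchmark)
print(alpha, beta)
```

A beta above 1, as for TSLA in the table above, means the stock tends to move more than the benchmark.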

Summary!

The Portfolio object has a summary method which gives a summary of the portfolio. The method accepts frequency, weights and column parameters, all of which are optional:

In [86]:
portfolio.summary()
Portfolio Summary
*****************

Portfolio with benchmark Dow_Jones and stocks ['AAPL', 'TSLA', 'MSFT']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+-----------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |     Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+-----------+----------+----------|
|  0 | AAPL    |             0.0203446 |            0.0810776 | 0.0136708 | 0.976056 | 0.333333 |
|  1 | TSLA    |             0.0502608 |            0.182803  | 0.0408666 | 1.71374  | 0.333333 |
|  2 | MSFT    |             0.0185164 |            0.0607293 | 0.0133012 | 0.858644 | 0.333333 |
+----+---------+-----------------------+----------------------+-----------+----------+----------+
The covariance matrix is as follows
+------+------------+------------+------------+
|      |       AAPL |       TSLA |       MSFT |
|------+------------+------------+------------|
| AAPL | 0.00657358 | 0.00564335 | 0.00248329 |
| TSLA | 0.00564335 | 0.033417   | 0.00397697 |
| MSFT | 0.00248329 | 0.00397697 | 0.00368805 |
+------+------------+------------+------------+
Portfolio Return: 0.029707288531672652
Portfolio Volatility: 0.18991518028969553

If you are feeling lazy and don't want to call a couple of methods to calculate the return, alpha and beta, you can just call the summary method and it calculates all the values under the hood!

The calculation of FFF parameters, however, is not included in the summary method. The reason is that these calculations are a bit involved, and unless you want to optimize the portfolio using the fff3 or fff5 model, you don't need them at all.

FFF Parameters

To use the Fama–French three-factor or five-factor model, you need the parameters for the corresponding three or five factors. As usual, there are two methods for this:

calculate_fff_params_one

This calculates the FFF params for the given stock. You can pass the name of the stock or the stock itself. The function uses the Stock.load_fff method to load the FFF data. See the corresponding section for more detail.

In [79]:
apple_fff5 = portfolio.calculate_fff_params_one("AAPL", frequency="M", factors=5, directory="Data")
In [75]:
apple_fff5
Out[75]:
const     0.005514
Mkt-RF    1.197614
SMB      -0.258640
HML      -0.513560
RMW       0.750830
CMA      -0.181558
rf        1.000000
dtype: float64
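
The fitted parameters above look like the coefficients of a linear regression of a stock's return on the factor returns. The sketch below shows that kind of fit with NumPy's least-squares solver on synthetic data. The factor values, loadings, and variable names are all hypothetical, and whether pystock performs the regression this way (or how it treats the rf entry) is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_months = 120

# Hypothetical factor returns: columns play the role of Mkt-RF, SMB and HML.
factors = rng.normal(0.0, 0.03, size=(n_months, 3))
true_const = 0.005
true_loadings = np.array([1.2, -0.3, -0.5])

# Hypothetical excess stock return (stock return minus risk-free rate), noise-free.
excess_return = true_const + factors @ true_loadings

# Ordinary least squares with an intercept column.
design = np.column_stack([np.ones(n_months), factors])
params, *_ = np.linalg.lstsq(design, excess_return, rcond=None)
print(dict(zip(["const", "Mkt-RF", "SMB", "HML"], params.round(6))))
```

Because the synthetic data has no noise, the solver recovers the constant and the three loadings that were used to build it.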

calculate_fff_params

You already know what this method does: it calculates the FFF parameters for all the stocks in the portfolio.

In [80]:
all_ff5 = portfolio.calculate_fff_params(frequency="M", factors=5, directory="Data", verbose=0)
Done. Here are the parameters
+-------------+------------+-------------+
|        AAPL |       TSLA |        MSFT |
|-------------+------------+-------------|
|  0.00551378 |  0.0341999 |  0.00794491 |
|  1.19761    |  1.87135   |  0.993504   |
| -0.25864    | -0.379263  | -0.787259   |
| -0.51356    | -0.610234  |  0.0173204  |
|  0.75083    | -1.45574   | -0.100924   |
| -0.181558   | -0.497976  | -0.529687   |
|  1          |  1         |  1          |
+-------------+------------+-------------+

Once you have calculated the FFF parameters, you can access them through the params attribute of the Stock object.

In [82]:
portfolio["AAPL"].params
Out[82]:
const     0.005514
Mkt-RF    1.197614
SMB      -0.258640
HML      -0.513560
RMW       0.750830
CMA      -0.181558
rf        1.000000
dtype: float64

Or use the stock_params attribute of the Portfolio object:

In [84]:
portfolio.stock_params
Out[84]:
{'AAPL': const     0.005514
 Mkt-RF    1.197614
 SMB      -0.258640
 HML      -0.513560
 RMW       0.750830
 CMA      -0.181558
 rf        1.000000
 dtype: float64,
 'TSLA': const     0.034200
 Mkt-RF    1.871348
 SMB      -0.379263
 HML      -0.610234
 RMW      -1.455743
 CMA      -0.497976
 rf        1.000000
 dtype: float64,
 'MSFT': const     0.007945
 Mkt-RF    0.993504
 SMB      -0.787259
 HML       0.017320
 RMW      -0.100924
 CMA      -0.529687
 rf        1.000000
 dtype: float64}

The Mean Values

These values are required while calculating the expected stock return using the fff3 or fff5 model. If you have called the calculate_fff_params_one or calculate_fff_params method, you don't need to do anything else. The mean values have been calculated and can be accessed through the mean_values attribute. If you have not called at least one of these methods, well, call it!

In [85]:
portfolio.mean_values
Out[85]:
const     1.000000
Mkt-RF    0.005592
SMB       0.002236
HML       0.003101
RMW       0.002819
CMA       0.002955
RF        0.003621
dtype: float64
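
One plausible reading of mean_values, offered as an assumption rather than a description of pystock internals, is that the expected return under the FFF models is the sum of each fitted parameter times the mean of its factor. The sketch below applies that rule to the AAPL parameters and the means printed above.

```python
import numpy as np

# AAPL parameters and factor means copied from the outputs above, in the same order:
# const, Mkt-RF, SMB, HML, RMW, CMA, risk-free rate.
params = np.array([0.005514, 1.197614, -0.258640, -0.513560, 0.750830, -0.181558, 1.000000])
means = np.array([1.000000, 0.005592, 0.002236, 0.003101, 0.002819, 0.002955, 0.003621])

# Assumed combination rule: expected return = sum of parameter * factor mean.
expected_monthly_return = float(params @ means)
print(f"{expected_monthly_return:.6f}")
```

With these inputs the result is about 0.0152, that is, roughly 1.5% per month.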

Model Class

This class has methods to optimize the portfolio. The class is built on top of the Portfolio class. Let's get started!

Getting Started With Model

Let's instantiate the model:

In [3]:
model = Model("M")

The only parameters which the Model expects are the frequency and risk_free_rate.

Creating a Portfolio

The easiest way to get started with Model is by using the create_portfolio method. This method creates a portfolio by using the benchmark_dir, benchmark_name, stock_dirs, and stock_names. The method accepts some other parameters which are necessary to create a Portfolio.

In [5]:
benchmark_dir = "Data/GSPC.csv"
benchmark_name = "S&P"

stock_dirs = ["Data/AAPL.csv", "Data/MSFT.csv", "Data/GOOG.csv", "Data/TSLA.csv"]
stock_names = ["AAPL", "MSFT", "GOOG", "TSLA"]

frequency = "M"
start_date = "2012-01-01"
end_date = "2022-12-20"

portfolio = model.create_portfolio(
    benchmark_dir=benchmark_dir,
    benchmark_name=benchmark_name,
    stock_dirs=stock_dirs,
    stock_names=stock_names,
    frequency=frequency,
    start_date=start_date,
    end_date=end_date
)
Loading benchmark...
Loading stocks...
Calculating other results...
Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'MSFT', 'GOOG', 'TSLA']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+----------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897  |     0.25 |
|  1 | MSFT    |             0.0202463 |            0.0607759 | 0.010973   | 0.964346 |     0.25 |
|  2 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298  |     0.25 |
|  3 | TSLA    |             0.0502608 |            0.182803  | 0.0336704  | 1.72526  |     0.25 |
+----+---------+-----------------------+----------------------+------------+----------+----------+
The covariance matrix is as follows
+------+------------+------------+------------+------------+
|      |       AAPL |       MSFT |       GOOG |       TSLA |
|------+------------+------------+------------+------------|
| AAPL | 0.00662438 | 0.00252134 | 0.00215109 | 0.00566592 |
| MSFT | 0.00252134 | 0.00369371 | 0.00230756 | 0.00399494 |
| GOOG | 0.00215109 | 0.00230756 | 0.00416849 | 0.00347788 |
| TSLA | 0.00566592 | 0.00399494 | 0.00347788 | 0.033417   |
+------+------------+------------+------------+------------+
Portfolio Return: 0.02731678889316875
Portfolio Volatility: 0.15603147059665842

The create_portfolio method returns the Portfolio object. It does all the work of loading the data, merging the data and calculating the parameters. If your goal is to optimize the portfolio using the capm or sim model, you don't need to do anything else. Just call the optimize_portfolio method.

This method, by default, loads just the "Adj Close" column and renames it to "Close".

Though this method is enough for many tasks, it is not the recommended way to use the module. You should create a Portfolio object and then use another method to add it to the Model object.

Adding a Portfolio

Start by creating a Portfolio object. Then, use the add_portfolio method to add it to the Model object.

In [7]:
benchmark_dir = "Data/GSPC.csv"
benchmark_name = "S&P"

stock_dirs = ["Data/AAPL.csv", "Data/MSFT.csv", "Data/GOOG.csv", "Data/TSLA.csv"]
stock_names = ["AAPL", "MSFT", "GOOG", "TSLA"]

frequency = "M"
pt = Portfolio(benchmark_dir, benchmark_name, stock_dirs, stock_names)
start_date = "2012-01-01"
end_date = "2022-12-20"
pt.load_benchmark(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)
pt.load_all(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)

Let's print the portfolio summary:

In [8]:
pt.summary()
Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'MSFT', 'GOOG', 'TSLA']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+----------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897  |     0.25 |
|  1 | MSFT    |             0.0202463 |            0.0607759 | 0.010973   | 0.964346 |     0.25 |
|  2 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298  |     0.25 |
|  3 | TSLA    |             0.0502608 |            0.182803  | 0.0336704  | 1.72526  |     0.25 |
+----+---------+-----------------------+----------------------+------------+----------+----------+
The covariance matrix is as follows
+------+------------+------------+------------+------------+
|      |       AAPL |       MSFT |       GOOG |       TSLA |
|------+------------+------------+------------+------------|
| AAPL | 0.00662438 | 0.00252134 | 0.00215109 | 0.00566592 |
| MSFT | 0.00252134 | 0.00369371 | 0.00230756 | 0.00399494 |
| GOOG | 0.00215109 | 0.00230756 | 0.00416849 | 0.00347788 |
| TSLA | 0.00566592 | 0.00399494 | 0.00347788 | 0.033417   |
+------+------------+------------+------------+------------+
Portfolio Return: 0.02731678889316875
Portfolio Volatility: 0.15603147059665842

Note that you need to calculate the FFF parameters explicitly if you want to use the FFF models. Let's do that:

In [9]:
pt.calculate_fff_params(frequency="M", factors=5, directory="Data", verbose=0)
Done. Here are the parameters
+-------------+-------------+-------------+------------+
|        AAPL |        MSFT |        GOOG |       TSLA |
|-------------+-------------+-------------+------------|
|  0.00682682 |  0.00970946 |  0.00699103 |  0.0341999 |
|  1.19736    |  0.994033   |  1.01023    |  1.87135   |
| -0.249423   | -0.776582   | -0.572982   | -0.379263  |
| -0.512566   |  0.0184882  |  0.199086   | -0.610234  |
|  0.749702   | -0.103125   | -0.107732   | -1.45574   |
| -0.201821   | -0.548657   | -0.867854   | -0.497976  |
|  1          |  1          |  1          |  1         |
+-------------+-------------+-------------+------------+

Great! Now you can optimize the portfolio. But there is another method which we need to discuss.

Updating a Portfolio

The Model object accepts just one Portfolio. You can update the portfolio with another one:

In [10]:
benchmark_dir = "Data/GSPC.csv"
benchmark_name = "S&P"

stock_dirs = ["Data/AAPL.csv", "Data/MSFT.csv", "Data/GOOG.csv"]
stock_names = ["AAPL", "MSFT", "GOOG"]

frequency = "M"
pt2 = Portfolio(benchmark_dir, benchmark_name, stock_dirs, stock_names)
start_date = "2012-01-01"
end_date = "2022-12-20"
pt2.load_benchmark(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)
pt2.load_all(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)
In [11]:
model.portfolio
Out[11]:
Portfolio(S&P,['AAPL', 'MSFT', 'GOOG', 'TSLA'])
In [14]:
model.update_portfolio(pt2, weights="equal")
Adding portfolio...
Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'MSFT', 'GOOG']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+----------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897  | 0.333333 |
|  1 | MSFT    |             0.0202463 |            0.0607759 | 0.010973   | 0.964346 | 0.333333 |
|  2 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298  | 0.333333 |
+----+---------+-----------------------+----------------------+------------+----------+----------+
The covariance matrix is as follows
+------+------------+------------+------------+
|      |       AAPL |       MSFT |       GOOG |
|------+------------+------------+------------|
| AAPL | 0.00662438 | 0.00252134 | 0.00215109 |
| MSFT | 0.00252134 | 0.00369371 | 0.00230756 |
| GOOG | 0.00215109 | 0.00230756 | 0.00416849 |
+------+------------+------------+------------+
Portfolio Return: 0.019668774918203367
Portfolio Volatility: 0.15155859087536744
In [15]:
model.portfolio
Out[15]:
Portfolio(S&P,['AAPL', 'MSFT', 'GOOG'])

The function calls the Portfolio.summary() method to make the model ready for optimization.

The load_portfolio Function

Suppose you created a Portfolio but have not loaded the data yet. You then add it to the Model by setting the portfolio attribute. You can use the Model.portfolio attribute to load the benchmark and stock data yourself, or you can use the load_portfolio method, which does all this.

In [19]:
benchmark_dir = "Data/GSPC.csv"
benchmark_name = "S&P"

stock_dirs = ["Data/AAPL.csv", "Data/MSFT.csv", "Data/GOOG.csv"]
stock_names = ["AAPL", "MSFT", "GOOG"]

frequency = "M"
pt2 = Portfolio(benchmark_dir, benchmark_name, stock_dirs, stock_names)
start_date = "2012-01-01"
end_date = "2022-12-20"

model = Model("M")

model.portfolio = pt2
In [20]:
model.portfolio
Out[20]:
Portfolio(S&P,['AAPL', 'MSFT', 'GOOG'])
In [21]:
model.load_portfolio(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)
Loading benchmark...
Loading stocks...
Calculating other results...
Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'MSFT', 'GOOG']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+----------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897  | 0.333333 |
|  1 | MSFT    |             0.0202463 |            0.0607759 | 0.010973   | 0.964346 | 0.333333 |
|  2 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298  | 0.333333 |
+----+---------+-----------------------+----------------------+------------+----------+----------+
The covariance matrix is as follows
+------+------------+------------+------------+
|      |       AAPL |       MSFT |       GOOG |
|------+------------+------------+------------|
| AAPL | 0.00662438 | 0.00252134 | 0.00215109 |
| MSFT | 0.00252134 | 0.00369371 | 0.00230756 |
| GOOG | 0.00215109 | 0.00230756 | 0.00416849 |
+------+------------+------------+------------+
Portfolio Return: 0.019668774918203367
Portfolio Volatility: 0.15155859087536744

Optimization

Before optimizing the portfolio, suppose you want to try some weights and see how the return and risk change. Or you just want to see the expected return of a stock based on its calculated parameters. The Model has some methods for this. Let's create a model:

In [24]:
benchmark_dir = "Data/GSPC.csv"
benchmark_name = "S&P"

stock_dirs = ["Data/AAPL.csv", "Data/MSFT.csv", "Data/GOOG.csv", "Data/TSLA.csv"]
stock_names = ["AAPL", "MSFT", "GOOG", "TSLA"]

frequency = "M"
pt = Portfolio(benchmark_dir, benchmark_name, stock_dirs, stock_names)
start_date = "2012-01-01"
end_date = "2022-12-20"
pt.load_benchmark(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)
pt.load_all(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)
In [25]:
model = Model()
model.add_portfolio(pt, weights="equal")
Adding portfolio...
Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'MSFT', 'GOOG', 'TSLA']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+----------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897  |     0.25 |
|  1 | MSFT    |             0.0202463 |            0.0607759 | 0.010973   | 0.964346 |     0.25 |
|  2 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298  |     0.25 |
|  3 | TSLA    |             0.0502608 |            0.182803  | 0.0336704  | 1.72526  |     0.25 |
+----+---------+-----------------------+----------------------+------------+----------+----------+
The covariance matrix is as follows
+------+------------+------------+------------+------------+
|      |       AAPL |       MSFT |       GOOG |       TSLA |
|------+------------+------------+------------+------------|
| AAPL | 0.00662438 | 0.00252134 | 0.00215109 | 0.00566592 |
| MSFT | 0.00252134 | 0.00369371 | 0.00230756 | 0.00399494 |
| GOOG | 0.00215109 | 0.00230756 | 0.00416849 | 0.00347788 |
| TSLA | 0.00566592 | 0.00399494 | 0.00347788 | 0.033417   |
+------+------------+------------+------------+------------+
Portfolio Return: 0.02731678889316875
Portfolio Volatility: 0.15603147059665842

We won't calculate the FFF parameters just yet.

expected_return_of_stock

This function returns what the name says:

In [5]:
exp_return = model.expected_return_of_stock(pt["AAPL"], model="capm")
exp_return
Warning. FFF params have not been calculated. Using ff3 or ff5 model will result in error.
Out[5]:
1.1054838377097305
In [6]:
exp_return = model.expected_return_of_stock(pt["AAPL"], model="sim")
exp_return
Warning. FFF params have not been calculated. Using ff3 or ff5 model will result in error.
Out[6]:
1.115282131986719
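
For reference, the textbook forms of the two models can be written as two small functions. This is a sketch of the standard formulas with hypothetical inputs. It does not claim to reproduce the numbers above, since the risk-free rate and market return used by pystock are not shown here, and the function names are made up for this illustration.

```python
def capm_expected_return(risk_free, beta, market_return):
    """Textbook CAPM: risk-free rate plus beta times the market risk premium."""
    return risk_free + beta * (market_return - risk_free)

def sim_expected_return(alpha, beta, market_return):
    """One common form of the single-index model: alpha plus beta times the market return."""
    return alpha + beta * market_return

# Hypothetical monthly inputs, for illustration only.
risk_free = 0.003
market_return = 0.009
alpha, beta = 0.002, 1.2

capm = capm_expected_return(risk_free, beta, market_return)
sim = sim_expected_return(alpha, beta, market_return)
print(capm, sim)
```

Both models scale the market's return by the same beta and differ only in the constant term, which is why their predictions tend to sit close together.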

The returns from the capm and sim models are almost the same. Let's try the FFF models. As the warning message says, we first have to do the FFF calculations.

In [26]:
pt.calculate_fff_params(frequency="M", factors=5, directory="Data", verbose=0)
Done. Here are the parameters
+-------------+-------------+-------------+------------+
|        AAPL |        MSFT |        GOOG |       TSLA |
|-------------+-------------+-------------+------------|
|  0.00682682 |  0.00970946 |  0.00699103 |  0.0341999 |
|  1.19736    |  0.994033   |  1.01023    |  1.87135   |
| -0.249423   | -0.776582   | -0.572982   | -0.379263  |
| -0.512566   |  0.0184882  |  0.199086   | -0.610234  |
|  0.749702   | -0.103125   | -0.107732   | -1.45574   |
| -0.201821   | -0.548657   | -0.867854   | -0.497976  |
|  1          |  1          |  1          |  1         |
+-------------+-------------+-------------+------------+
In [8]:
exp_return = model.expected_return_of_stock(pt["AAPL"], model="fff5")
exp_return
Out[8]:
1.65139661893182
In [9]:
exp_return = model.expected_return_of_stock(pt["AAPL"], model="fff3")
exp_return
Out[9]:
1.499680184234527

So, the FFF models predict higher returns!

portfolio_info

This method returns the expected return and risk of the portfolio given the weights and model:

In [14]:
model.portfolio
Out[14]:
Portfolio(S&P,['AAPL', 'MSFT', 'GOOG', 'TSLA'])
In [18]:
weights = "equal"
model_ = "capm"
exp_return, variance, _ = model.portfolio_info(weights=weights, model=model_)
print(f"Expected Return: {exp_return:.2f}%")
print(f"Expected Variance: {variance:.2f}")
Expected Return: 1.11%
Expected Variance: 0.55
In [19]:
weights = "equal"
model_ = "fff5"
exp_return, variance, _ = model.portfolio_info(weights=weights, model=model_)
print(f"Expected Return: {exp_return:.2f}%")
print(f"Expected Variance: {variance:.2f}")
Expected Return: 2.11%
Expected Variance: 0.55
In [20]:
weights = [0.2, 0.2, 0.2, 0.4]
model_ = "fff5"
exp_return, variance, _ = model.portfolio_info(weights=weights, model=model_)
print(f"Expected Return: {exp_return:.2f}%")
print(f"Expected Variance: {variance:.2f}")
Expected Return: 2.49%
Expected Variance: 0.86
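
The variance values printed above are consistent with the textbook quadratic form $w^T \Sigma w$ applied to the covariance matrix from the summary and expressed in percent. The sketch below checks this by hand. It is not pystock code, variance_percent is a hypothetical helper, and the scaling by 100 is inferred from the printed numbers rather than taken from the library's documentation.

```python
import numpy as np

# Covariance matrix of monthly returns (AAPL, MSFT, GOOG, TSLA) from the summary above.
cov = np.array([
    [0.00662438, 0.00252134, 0.00215109, 0.00566592],
    [0.00252134, 0.00369371, 0.00230756, 0.00399494],
    [0.00215109, 0.00230756, 0.00416849, 0.00347788],
    [0.00566592, 0.00399494, 0.00347788, 0.03341700],
])

def variance_percent(weights, cov):
    """Portfolio variance w' Sigma w, scaled by 100 to express it in percent."""
    w = np.asarray(weights, dtype=float)
    return float(w @ cov @ w) * 100

equal = variance_percent([0.25, 0.25, 0.25, 0.25], cov)
tilted = variance_percent([0.2, 0.2, 0.2, 0.4], cov)
print(f"{equal:.2f} {tilted:.2f}")
```

The variance does not depend on the chosen model, which matches the identical 0.55 reported for capm and fff5 with equal weights. Shifting weight toward TSLA raises it because TSLA has by far the largest variance.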

Plotting the Portfolio Frontier

If you have just two stocks in your portfolio, you can use the portfolio_frontier method to plot the portfolio frontier with a model.

In [27]:
pt.remove_stocks(["TSLA", "MSFT"])
In [28]:
model.portfolio
Out[28]:
Portfolio(S&P,['AAPL', 'GOOG'])

As you have deleted two stocks, you need to call summary again to recalculate the params.

In [29]:
pt.summary()
Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'GOOG']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+---------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |    Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+---------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897 |      0.5 |
|  1 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298 |      0.5 |
+----+---------+-----------------------+----------------------+------------+---------+----------+
The covariance matrix is as follows
+------+------------+------------+
|      |       AAPL |       GOOG |
|------+------------+------------|
| AAPL | 0.00662438 | 0.00215109 |
| GOOG | 0.00215109 | 0.00416849 |
+------+------------+------------+
Portfolio Return: 0.019379990857807596
Portfolio Volatility: 0.19101971223336575

You will also need to delete the calculated FFF params for the removed stocks.

In [30]:
del pt.stock_params["MSFT"]
del pt.stock_params["TSLA"]
In [31]:
model.portfolio.stock_params
Out[31]:
{'AAPL': const     0.006827
 Mkt-RF    1.197356
 SMB      -0.249423
 HML      -0.512566
 RMW       0.749702
 CMA      -0.201821
 rf        1.000000
 dtype: float64,
 'GOOG': const     0.006991
 Mkt-RF    1.010230
 SMB      -0.572982
 HML       0.199086
 RMW      -0.107732
 CMA      -0.867854
 rf        1.000000
 dtype: float64}
In [32]:
model.portfolio_frontier(model="capm")
In [33]:
model.portfolio_frontier(model="sim")
In [34]:
model.portfolio_frontier(model="fff3")
In [35]:
model.portfolio_frontier(model="fff5")

The fff3 frontier comes out very different from the others.
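
For two stocks, a frontier is just the curve traced by the portfolio's variance and return as the weight of one stock sweeps from 0 to 1. The sketch below computes those points with NumPy. It uses the historical mean returns from the summary instead of a model's expected returns, which is a simplifying assumption, so it will not match any of the four plots exactly.

```python
import numpy as np

# Historical monthly mean returns and covariance for AAPL and GOOG from the summary above.
mean_returns = np.array([0.0216164, 0.0171436])
cov = np.array([
    [0.00662438, 0.00215109],
    [0.00215109, 0.00416849],
])

# Sweep the AAPL weight from 0 to 1; GOOG receives the remainder.
aapl_weights = np.linspace(0.0, 1.0, 11)
frontier = []
for w in aapl_weights:
    weights = np.array([w, 1.0 - w])
    frontier.append((float(weights @ cov @ weights), float(weights @ mean_returns)))

for variance, ret in frontier:
    print(f"variance={variance:.5f}  return={ret:.5f}")
```

Swapping in a different vector of expected returns changes only the return axis; the variance of each weight combination stays the same.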

Optimization (at last!)

Okay, let's optimize the following Portfolio:

In [39]:
benchmark_dir = "Data/GSPC.csv"
benchmark_name = "S&P"

stock_dirs = ["Data/AAPL.csv", "Data/MSFT.csv", "Data/GOOG.csv", "Data/TSLA.csv"]
stock_names = ["AAPL", "MSFT", "GOOG", "TSLA"]

frequency = "M"
pt = Portfolio(benchmark_dir, benchmark_name, stock_dirs, stock_names)
start_date = "2012-01-01"
end_date = "2022-12-20"
pt.load_benchmark(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)
pt.load_all(
    columns=["Adj Close"],
    rename_cols=["Close"],
    start_date=start_date,
    end_date=end_date,
    frequency=frequency,
)
In [40]:
model = Model()
model.add_portfolio(pt, weights="equal")
Adding portfolio...
Portfolio Summary
*****************

Portfolio with benchmark S&P and stocks ['AAPL', 'MSFT', 'GOOG', 'TSLA']
Here are the summary of stocks in the portfolio
+----+---------+-----------------------+----------------------+------------+----------+----------+
|    | Stock   |   Monthly_Mean_Return |   Monthly_Return_STD |      Alpha |     Beta |   Weight |
|----+---------+-----------------------+----------------------+------------+----------+----------|
|  0 | AAPL    |             0.0216164 |            0.0813903 | 0.00979829 | 1.22897  |     0.25 |
|  1 | MSFT    |             0.0202463 |            0.0607759 | 0.010973   | 0.964346 |     0.25 |
|  2 | GOOG    |             0.0171436 |            0.0645638 | 0.00711404 | 1.04298  |     0.25 |
|  3 | TSLA    |             0.0502608 |            0.182803  | 0.0336704  | 1.72526  |     0.25 |
+----+---------+-----------------------+----------------------+------------+----------+----------+
The covariance matrix is as follows
+------+------------+------------+------------+------------+
|      |       AAPL |       MSFT |       GOOG |       TSLA |
|------+------------+------------+------------+------------|
| AAPL | 0.00662438 | 0.00252134 | 0.00215109 | 0.00566592 |
| MSFT | 0.00252134 | 0.00369371 | 0.00230756 | 0.00399494 |
| GOOG | 0.00215109 | 0.00230756 | 0.00416849 | 0.00347788 |
| TSLA | 0.00566592 | 0.00399494 | 0.00347788 | 0.033417   |
+------+------------+------------+------------+------------+
Portfolio Return: 0.02731678889316875
Portfolio Volatility: 0.15603147059665842
In [41]:
pt.calculate_fff_params(frequency="M", factors=5, directory="Data", verbose=0)
Done. Here are the parameters
+-------------+-------------+-------------+------------+
|        AAPL |        MSFT |        GOOG |       TSLA |
|-------------+-------------+-------------+------------|
|  0.00682682 |  0.00970946 |  0.00699103 |  0.0341999 |
|  1.19736    |  0.994033   |  1.01023    |  1.87135   |
| -0.249423   | -0.776582   | -0.572982   | -0.379263  |
| -0.512566   |  0.0184882  |  0.199086   | -0.610234  |
|  0.749702   | -0.103125   | -0.107732   | -1.45574   |
| -0.201821   | -0.548657   | -0.867854   | -0.497976  |
|  1          |  1          |  1          |  1         |
+-------------+-------------+-------------+------------+

All set! All you need now is to call optimize_portfolio with the model, risk and can_short parameters. You may call portfolio_info first with the default parameters. This will give you an idea of how much risk to consider.

In [42]:
model.portfolio_info()
Out[42]:
(1.112657603301926, 0.550881209951722, 0.742213722556867)

It seems that the variance of the Portfolio with "equal" weights is 0.551. Let's see what the maximum return is at a similar risk.

In [47]:
def get_return(risk, can_short):
    models = ["capm", "sim", "fff3", "fff5"]
    for m in models:
        print(f"Optimizing for -> {m.upper()}")
        _ = model.optimize_portfolio(m, risk=risk, can_short=can_short)
        print()
In [48]:
risk = 0.5
can_short = False
get_return(risk, can_short)
Optimizing for -> CAPM
Optimized successfully.
Expected return: 1.1155%
Variance: 0.5000%
Expected weights:
--------------------
AAPL: 47.20%
MSFT: 0.00%
GOOG: 36.08%
TSLA: 16.73%

Optimizing for -> SIM
Optimized successfully.
Expected return: 1.1283%
Variance: 0.5000%
Expected weights:
--------------------
AAPL: 46.20%
MSFT: 0.00%
GOOG: 36.81%
TSLA: 17.00%

Optimizing for -> FFF3
Optimized successfully.
Expected return: 2.3360%
Variance: 0.5000%
Expected weights:
--------------------
AAPL: 0.00%
MSFT: 53.08%
GOOG: 23.86%
TSLA: 23.06%

Optimizing for -> FFF5
Optimized successfully.
Expected return: 2.0628%
Variance: 0.5000%
Expected weights:
--------------------
AAPL: 15.42%
MSFT: 53.74%
GOOG: 9.06%
TSLA: 21.78%

So, the FFF3 model gives the best return, 2.336%, for the weights:

AAPL: 0.00%
MSFT: 53.08%
GOOG: 23.86%
TSLA: 23.06%
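
The problem being solved in these runs is "find the weights with the highest expected return whose variance stays at or below the chosen risk". A brute-force version of that idea is sketched below. It searches a coarse grid of long-only weights and uses historical mean returns, so it is only an illustration of the constraint, not the optimizer pystock uses, and best_long_only is a hypothetical helper.

```python
import itertools
import numpy as np

# Historical monthly mean returns and covariance (AAPL, MSFT, GOOG, TSLA) from the summary above.
mean_returns = np.array([0.0216164, 0.0202463, 0.0171436, 0.0502608])
cov = np.array([
    [0.00662438, 0.00252134, 0.00215109, 0.00566592],
    [0.00252134, 0.00369371, 0.00230756, 0.00399494],
    [0.00215109, 0.00230756, 0.00416849, 0.00347788],
    [0.00566592, 0.00399494, 0.00347788, 0.03341700],
])
max_variance = 0.005  # a 0.5% variance cap, corresponding to risk = 0.5

def best_long_only(mean_returns, cov, max_variance, steps=20):
    """Grid-search long-only weights (multiples of 1/steps) for the highest return under the cap."""
    best_weights, best_return = None, -np.inf
    n = len(mean_returns)
    for combo in itertools.product(range(steps + 1), repeat=n - 1):
        if sum(combo) > steps:
            continue  # the last weight would be negative
        weights = np.array(list(combo) + [steps - sum(combo)]) / steps
        if weights @ cov @ weights <= max_variance and weights @ mean_returns > best_return:
            best_weights, best_return = weights, float(weights @ mean_returns)
    return best_weights, best_return

weights, ret = best_long_only(mean_returns, cov, max_variance)
print(weights, ret)
```

Allowing short positions would amount to letting some weights go negative while the weights still sum to 1, which enlarges the set of candidates the search may consider.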

Let's allow shorting:

In [49]:
risk = 0.5
can_short = True
get_return(risk, can_short)
Optimizing for -> CAPM
Optimized successfully.
Expected return: 1.1165%
Variance: 0.5001%
Expected weights:
--------------------
AAPL: 48.81%
MSFT: -8.60%
GOOG: 44.26%
TSLA: 15.53%

Optimizing for -> SIM
Optimized successfully.
Expected return: 1.1288%
Variance: 0.5001%
Expected weights:
--------------------
AAPL: 47.40%
MSFT: -5.97%
GOOG: 42.38%
TSLA: 16.19%

Optimizing for -> FFF3
Optimized successfully.
Expected return: 2.3391%
Variance: 0.5000%
Expected weights:
--------------------
AAPL: -5.51%
MSFT: 56.67%
GOOG: 25.98%
TSLA: 22.86%

Optimizing for -> FFF5
Optimized successfully.
Expected return: 2.0628%
Variance: 0.5000%
Expected weights:
--------------------
AAPL: 15.42%
MSFT: 53.74%
GOOG: 9.06%
TSLA: 21.78%

Very little increase in the maximum return is observed (2.3391%, up from 2.3360%) with the weights:

AAPL: -5.51%
MSFT: 56.67%
GOOG: 25.98%
TSLA: 22.86%

Let's increase the risk to 1:

In [50]:
risk = 1
can_short = False
get_return(risk, can_short)
Optimizing for -> CAPM
Optimized successfully.
Expected return: 1.2233%
Variance: 1.0000%
Expected weights:
--------------------
AAPL: 62.21%
MSFT: 0.00%
GOOG: 0.00%
TSLA: 37.79%

Optimizing for -> SIM
Optimized successfully.
Expected return: 1.2421%
Variance: 1.0001%
Expected weights:
--------------------
AAPL: 61.20%
MSFT: 0.00%
GOOG: 0.74%
TSLA: 38.06%

Optimizing for -> FFF3
Optimized successfully.
Expected return: 3.0121%
Variance: 1.0000%
Expected weights:
--------------------
AAPL: 0.00%
MSFT: 47.74%
GOOG: 6.33%
TSLA: 45.92%

Optimizing for -> FFF5
Optimized successfully.
Expected return: 2.6580%
Variance: 1.0000%
Expected weights:
--------------------
AAPL: 10.72%
MSFT: 44.08%
GOOG: 0.00%
TSLA: 45.20%

The maximum return increases (as it should). The new maximum return is 3.0121% with the weights:

AAPL: 0.00%
MSFT: 47.74%
GOOG: 6.33%
TSLA: 45.92%
In [51]:
risk = 1
can_short = True
get_return(risk, can_short)
Optimizing for -> CAPM
Optimized successfully.
Expected return: 1.2434%
Variance: 1.0001%
Expected weights:
--------------------
AAPL: 76.10%
MSFT: -57.41%
GOOG: 49.23%
TSLA: 32.08%

Optimizing for -> SIM
Optimized successfully.
Expected return: 1.2591%
Variance: 0.9999%
Expected weights:
--------------------
AAPL: 73.48%
MSFT: -52.53%
GOOG: 45.76%
TSLA: 33.30%

Optimizing for -> FFF3
Optimized successfully.
Expected return: 3.0453%
Variance: 0.9999%
Expected weights:
--------------------
AAPL: -24.65%
MSFT: 63.64%
GOOG: 15.33%
TSLA: 45.68%

Optimizing for -> FFF5
Optimized successfully.
Expected return: 2.6661%
Variance: 1.0000%
Expected weights:
--------------------
AAPL: 14.17%
MSFT: 58.19%
GOOG: -16.04%
TSLA: 43.68%

Allowing shorting does not have a very large effect: the best return (FFF3) rises only from 3.0121% to 3.0453%.
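
As a quick check of how small the effect is, the snippet below computes the gain per model at risk = 1. The figures are copied from the two outputs above; the dictionary names are only illustrative.

```python
# Expected return (%) per model at risk = 1, copied from the outputs above.
long_only = {"CAPM": 1.2233, "SIM": 1.2421, "FFF3": 3.0121, "FFF5": 2.6580}
with_short = {"CAPM": 1.2434, "SIM": 1.2591, "FFF3": 3.0453, "FFF5": 2.6661}

# Gain from allowing short positions, in percentage points.
gains = {m: round(with_short[m] - long_only[m], 4) for m in long_only}
for m, g in gains.items():
    print(f"{m}: +{g:.4f} percentage points")
```

Every gain is below 0.04 percentage points, even though the weights themselves change substantially (MSFT goes to -57.41% under CAPM).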

Finally, we'll consider a very small risk.

In [52]:
risk = 0.1
can_short = False
get_return(risk, can_short)
Optimizing for -> CAPM
Optimization failed. Positive directional derivative for linesearch
Here are the last results:
Expected return: 0.9828%
Variance: 0.2992%
Expected weights:
--------------------
AAPL: 14.90%
MSFT: 47.07%
GOOG: 38.03%
TSLA: 0.00%

Optimizing for -> SIM
Optimization failed. Positive directional derivative for linesearch
Here are the last results:
Expected return: 0.9921%
Variance: 0.2992%
Expected weights:
--------------------
AAPL: 14.90%
MSFT: 47.07%
GOOG: 38.03%
TSLA: 0.00%

Optimizing for -> FFF3
Optimization failed. Positive directional derivative for linesearch
Here are the last results:
Expected return: 1.6268%
Variance: 0.2992%
Expected weights:
--------------------
AAPL: 14.90%
MSFT: 47.07%
GOOG: 38.03%
TSLA: 0.00%

Optimizing for -> FFF5
Optimization failed. Positive directional derivative for linesearch
Here are the last results:
Expected return: 1.4503%
Variance: 0.2992%
Expected weights:
--------------------
AAPL: 14.90%
MSFT: 47.07%
GOOG: 38.03%
TSLA: 0.00%

The optimizer cannot reach a risk this low: it fails for every model and stops at a variance of 0.2992%. The best of the last results (FFF3) is:

Expected return: 1.6268%
Variance: 0.2992%
Expected weights:
AAPL: 14.90%
MSFT: 47.07%
GOOG: 38.03%
TSLA: 0.00%
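
The failure message matches SciPy's SLSQP exit mode 8 ("Positive directional derivative for linesearch"), which often shows up when the constraints cannot all be met. One way to check a risk target before optimizing is to compute the smallest variance any long-only portfolio can reach. The sketch below does this for a hypothetical covariance matrix; `cov` and `min_variance` are illustrative names, not part of pystock.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical covariance matrix, for illustration only.
cov = np.diag([0.3, 0.4, 0.5, 1.9]) + 0.1  # adds 0.1 to every entry


def min_variance(cov, can_short=False):
    """Return (variance, weights) of the minimum-variance portfolio."""
    n = cov.shape[0]
    bounds = None if can_short else [(0.0, 1.0)] * n
    res = minimize(lambda w: w @ cov @ w, np.full(n, 1.0 / n), method="SLSQP",
                   bounds=bounds,
                   constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}])
    return res.x @ cov @ res.x, res.x


floor, weights = min_variance(cov)
for target in (0.1, 0.5):
    status = "attainable" if target >= floor else "below the minimum variance"
    print(f"target variance {target}: {status} (floor = {floor:.4f})")
```

If the same holds for the notebook's data, the variance of 0.2992% reported in every failed run would be this floor, which would explain why all four models end with identical weights.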